Abstract

Background: This paper examines the individual factors that influence prevalence rates of canine heartworm 
in the contiguous United States. A data set provided by the Companion Animal Parasite Council, which contains 
county-by-county results of over nine million heartworm tests conducted during 2011 and 2012, is analyzed for
predictive structure. The goal is to identify the factors that are important in predicting high canine heartworm 
prevalence rates. 

Methods: The factors considered in this study are those envisioned to impact whether a dog is likely to have 
heartworm. The factors include climate conditions (annual temperature, precipitation, and relative humidity), 
socio-economic conditions (population density, household income), local topography (surface water and forestation 
coverage, elevation), and vector presence (several mosquito species). A baseline heartworm prevalence map is 
constructed using estimated proportions of positive tests in each county of the United States. A smoothing algorithm 
is employed to remove localized small-scale variation and highlight large-scale structures of the prevalence rates. 
Logistic regression is used to identify significant factors for predicting heartworm prevalence. 

Results: All of the examined factors have power in predicting heartworm prevalence, including median household 
income, annual temperature, county elevation, and presence of the mosquitoes Aedes trivittatus, Aedes sierrensis, and
Culex quinquefasciatus. Interactions among factors also exist.

Conclusions: The factors identified are significant in predicting heartworm prevalence. The factor list is likely 
incomplete due to data deficiencies. For example, coyotes and feral dogs are known reservoirs of heartworm 
infection. Unfortunately, no complete data of their populations were available. The regression model considered is 
currently being explored to forecast future values of heartworm prevalence. 
Background

The Companion Animal Parasite Council (CAPC) has 
compiled a data set of over nine million heartworm anti- 
gen tests performed on dogs in the United States dur- 
ing 2011 and 2012 [1]. These data are test results taken 
from dogs that visited veterinary clinics throughout the 
United States. From this data, our goal is to quantify 
the environmental, socio-economic, and vector factors 
that influence canine heartworm prevalence rates. Brown 
et al. [2] describe the data and list factors posited to 
influence canine heartworm prevalence rates (these are
discussed fully below). This paper quantitatively assesses 
which of the factors are important (or unimportant) and 
in what direction the factors impact (increase or decrease) 
heartworm prevalence rates. 

The data will be used to estimate the probability that
a dog entering a veterinary clinic will test positive for
heartworm. Although prevalence rates increase if a dog
is from an area with a high heartworm transmission rate,
the raw data do not describe transmission, but rather the
risk that a dog's infection is detected if it enters a clinic
within the United States and is tested for heartworm for
any reason whatsoever. The most common reasons for
testing a dog are: assessing the negative status of a dog
before it begins heartworm prevention, annual testing of
dogs on preventive prophylaxis to verify that the dog has
been protected, and assessing whether or not a dog with
clinical signs suggestive of heartworm disease is indeed
infected. In many areas of the United States, dogs are kept
on prophylaxis year round; however, in some areas,
veterinarians utilize heartworm prevention seasonally. Here,
annual testing verifies whether or not infection occurred
during the period when preventives were not taken. To
assess true transmission rates, it would be more appropriate
to follow dogs or other canines, specifically coyotes,
that are not receiving any form of prophylaxis [3,4];
these are not the canines studied here. According to the
unpublished abstract of Pulaski et al. for the 58th Annual
Meeting of the American Association of Veterinary
Parasitologists and the unpublished presentation of Blagburn
et al. at The Triennial Symposium of the American
Heartworm Society in 2013, resistance to heartworm
preventatives has been recognized in parts of the United
States. Thus, future data may help detect preventive
failure.

Figure 1 Raw reported heartworm prevalence rates for 2011 and 2012.

Figure 2 Head-banging smoothed heartworm prevalence rates for 2011 and 2012.

A spatial logistic regression model will be fitted to
the CAPC county-by-county heartworm test results and
related to factor measurements. Logistic regression
methods are used in lieu of ordinary regression techniques
because prevalence probabilities, which must lie in the 
interval [0, 1], are being modeled. A significant technical 
challenge involves the large number of counties report- 
ing a small number of tests (often this count is zero). 
Small sample sizes from isolated counties can adversely 
impact results if not properly handled. Therefore, meth- 
ods are developed that account for sample size issues. 
The head-banging algorithm, a method for smoothing the 
county-by-county prevalence rates, will be used to extract 
general spatial structure in the prevalence estimates; this 
procedure is adept at dealing with outlying observations 
and boundary (edge) features. 

Our results are useful in a variety of contexts. First 
and foremost, predicting heartworm prevalence rates 
alerts the pet owner to high-risk areas. This will be evi- 
dent from the baseline risk maps constructed in Section 
"Construction of the baseline heartworm prevalence 
map". Second, pinpointing the factors accompanying high 
heartworm prevalence rates provides an opportunity to 
target those factors in mosquito and heartworm control
programs. Third, our results provide a quantitative
analysis of canine heartworm across the entire country,
allowing us to confirm that cases do occur in the Western
United States. Finally, the fitted regression model can be
used to forecast future prevalence levels of heartworm or
its response to climate change.

Figure 3 2011 U.S. annual average temperature (Degrees F). The temperature data included in this study were annual in nature and were
aggregated by the National Climatic Data Center (NCDC) [7] by climate region [8]. These data are not county-by-county: all counties within a
climate region are assigned the same annual temperature. For example, the state of Alabama has 67 counties and 8 climate regions. Annual
temperatures for 2011 were used to generate this graphic. Temperature dependence on latitude is clear.

Methods 

The data and factors 

This section describes the data provided by CAPC and 
lists the factors considered by our analysis. 

The raw test results were supplied by the Antech and 
IDEXX corporations [5,6] and report whether each per- 
formed test was positive or negative — no uncertainty 
margin is supplied for the results. While individual tests 
are reported by zip-code of the testing clinic, the raw data 
were aggregated into the number of positive tests and 
the total number of tests conducted over each calendar 
year in each county of the conterminous United States. 
Since only two calendar years of records are available, no 
attempt is made to include seasonal structure. The IDEXX
samples represent both the results of pet-side heartworm
antigen test kits and results from an IDEXX capture
system [Heartworm RT and the 4Dx Plus (and originally
4Dx tests)] along with tests run by the IDEXX diagnostic
laboratories [Heartworm Antigen by ELISA-Canine*].
Antech tests were performed at Antech Laboratories and 
utilized the Dirochek Assay and the AccuPlex4 heartworm 
antigen detection assay. Over 2011 and 2012, there were 
9,580,719 total tests performed by either method, of which 
111,259 were positive. Rudimentary statistical checks do 
not show vast differences between Antech and IDEXX 
samples. 

In most of the southeastern United States, veterinari- 
ans assume that outdoor dogs are at risk of heartworm 
infection, and thus, recommend that clients place their 
dogs on preventive protection. This may not be practi- 
cal for all pet owners due to costs. In the CAPC data, 
many counties in the eastern United States report a small 
number (say less than 20) of tests. This is likely because 
tests are not being reported or are incommensurate with 
CAPC protocols, not because they are uncommonly per- 
formed. Tests from such counties are likely performed for 
the same reasons as other southeastern counties report- 
ing a greater number of tests. In other areas of the United 
States, such as Montana or Idaho, testing is likely only 
performed if dogs have signs suggestive of heartworm 
disease and the veterinarian requires a confirmative test. 
Sometimes, veterinarians are aware that dogs travel to
heartworm endemic areas for part of the year. These
dogs may be tested annually when they return to their
home state. Also, as is evident from Figures 1 and 2,
heartworm clearly occurs in much of the Western United
States. In some of these areas, heartworm testing is
probably conducted for the same reasons as in more
prevalent areas. Overall, while it is understood
that there may be some sampling biases in certain areas
of the United States, the CAPC data seem fairly reflective
of a true random sample for many counties in the
United States.

Figure 4 2011 U.S. annual total precipitation (Inches). The precipitation data were also obtained from the NCDC and have the same spatial
resolution as the temperature data. Data used in this figure are for 2011. One sees a relatively dry Southwestern United States and higher
precipitation in the southeastern United States.

Other data aspects are worth illuminating. First, heart- 
worm infections are not detectable by almost all testing 
methods until about 6 or 7 months after the dogs have 
become infected; this is how long it takes for either 
microfilariae or antigen to appear in the blood after infec- 
tion. Thus, many of the detected infections likely com- 
menced the year prior to a positive test result. Because 
no travel information exists, it cannot be known where a
dog acquires infection. However, it is suspected that the
majority of infected dogs were infected close to home. 
Second, it is not known if a dog has been tested more 
than once. Dogs may be tested more than once annu- 
ally to verify the need for treatment, to verify successful 
treatment, or annually tested after the infection is first 
identified. 

The factors chosen for inclusion in this study are 
those envisioned to impact whether a dog is likely to 
have heartworm. These factors are a subset of those 
listed in Brown et al. [2] and contain climate variables 
(annual temperature, precipitation, and relative humid- 
ity; Figures 3, 4, 5); geographic factors (elevation, forest 
coverage, surface water coverage; Figures 6, 7, 8); soci- 
etal factors (human population density and household 
income; Figures 9 and 10); and the presence or absence 
of Aedes aegypti, Aedes albopictus and six other mosquito 
species (Figures 11, 12, 13, 14, 15, 16, 17, 18). Presence or 
absence of mosquito species was used because abundance
data are not available. Table 1 lists all considered
factors. Many (but not all) of the factors are available on
a county-by-county basis across the United States. Our
methods do not a priori assume that all factors
significantly influence heartworm prevalence rates, but rather
seek to determine which factors significantly influence
prevalence.

Figure 5 2011 U.S. annual relative humidities (Percent). Our humidity data were relative (and not absolute) humidities. Relative humidity is not
measured directly, but can be estimated from temperature and dew point (dew point is measured) via

$$RH = \frac{\exp\left(\dfrac{17.27\,(D-32)\,5/9}{(D-32)\,5/9 + 237.3}\right)}{\exp\left(\dfrac{17.27\,(T-32)\,5/9}{(T-32)\,5/9 + 237.3}\right)} \times 100\%,$$

where T is annual temperature, D is annual dew point, and exp is the exponential function [9]. As expected, the Southeast is the most humid region
in the United States.

Figure 6 U.S. county elevations (Feet). Elevation data were obtained on a county-by-county basis [10], with the height of the highest point in each
county used to produce this figure. Data containing average county elevations would be preferable to use but were not readily available. Of course,
any biases should be minor in the eastern United States where most counties are homogeneous in elevation. However, the western States are less
homogeneous. For example, Inyo County in California contains both the highest (Mount Whitney, 14,505 ft.) and lowest points (Death Valley, -282 ft.)
in the conterminous United States. These limitations aside, elevation is a potentially important factor for heartworm prevalence as higher elevations
are often associated with drier conditions.
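The relative humidity formula in the Figure 5 caption can be sketched in a few lines of Python. This is only an illustration: the function name is ours, and we assume that both temperature and dew point are supplied in degrees Fahrenheit, as the (x - 32)5/9 conversion in the formula suggests.

```python
import math


def relative_humidity(temp_f: float, dew_point_f: float) -> float:
    """Estimate relative humidity (percent) from temperature and dew point.

    Both inputs are assumed to be in degrees Fahrenheit (our assumption,
    based on the Fahrenheit-to-Celsius conversion inside the formula).
    """
    def vapor_term(deg_f: float) -> float:
        deg_c = (deg_f - 32.0) * 5.0 / 9.0  # Fahrenheit to Celsius
        return math.exp(17.27 * deg_c / (deg_c + 237.3))

    return 100.0 * vapor_term(dew_point_f) / vapor_term(temp_f)


# Illustrative inputs only (not values from the study).
rh_saturated = relative_humidity(70.0, 70.0)  # dew point equals temperature
rh_drier = relative_humidity(70.0, 50.0)      # dew point below temperature
```

When the dew point equals the temperature the ratio is one, so the estimate is 100%; a dew point below the temperature gives a value below 100%.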

Construction of the baseline heartworm prevalence map 

We first construct a baseline heartworm prevalence map. 
This is done on an annual basis as there is too little data 
to consider seasonal effects. As will become apparent, this 
analysis is a necessary precursor to assess factor impor- 
tance — informative factors should be able to reproduce 
the structure of our baseline prevalence map. For the 
years 2011 and 2012, all data were combined into a single 
sample. 

For a county s, let p(s) denote the probability that a single
dog tests heartworm positive. For notation, n(s) is the
number of tests in county s and k(s) is the number of
positive tests in county s. For example, if county s has 3
positive tests out of 100 during 2011 and 8 positive
tests out of 200 during 2012, then k(s) = 11, n(s) = 300,
and $\hat{p}(s) = 11/300$ (a hat over a quantity indicates it
is an estimate). Figure 1 displays county-by-county values
of $\hat{p}(s)$. This figure indicates that heartworm is most
problematic in the Lower Mississippi Valley. The role of
factors in explaining the prevalence rates will be discussed
in Section "Factor quantification". No factors are involved
in the calculation of $\hat{p}(s)$.

Since the number of dogs tested in distinct counties
greatly varies, the raw values of $\hat{p}(s)$ need to be weighted.
Estimated values of p(s) are more accurate for a sample
of 100 dogs than for a sample of 10 dogs. To quantify
this, the classical standard error is used. In particular, the
estimated variance of $\hat{p}(s)$ is

$$\mathrm{Var}(\hat{p}(s)) = \frac{\hat{p}(s)\,[1 - \hat{p}(s)]}{n(s)}. \qquad (1)$$

The estimated standard error of $\hat{p}(s)$ is the square root of
(1). We weight the values of $\hat{p}(s)$ inversely proportional to
this standard error. Before doing this, an adjustment was
made to the values of $\hat{p}(s)$ for small sample sizes.

Figure 7 2007 U.S. county forestation coverage (Percent). County forest coverage was obtained from the United States Department of Agriculture
(USDA) [11]. Total county area was obtained from the Census Bureau [12]. Percentages of forest coverage were calculated by dividing the forest
coverage area by county area (times 100%). The definition of forest coverage here is restricted to agriculture woodland, which means land
supporting trees capable of producing timber or other wood products, including but not limited to logs, lumber, posts, and firewood. These data are
updated by the USDA every five years. However, the 2012 data are not yet available; hence, the 2007 data were used to generate the graph.

Technically, counties where all tests are positive or
negative have $\mathrm{Var}(\hat{p}(s)) = 0$ (hence, its reciprocal is infinite),
which could adversely impact our ensuing smoothing
methods. To combat this, the Wilson estimator that adds
two to numerator counts and four to denominator counts
is used in lieu of $\hat{p}(s)$:

$$\hat{p}_w(s) = \frac{k(s) + 2}{n(s) + 4}.$$

This estimator has desirable sampling properties and
cannot be zero or unity [18]. Of course, for large k(s) and n(s),
$\hat{p}(s)$ and $\hat{p}_w(s)$ are approximately equivalent.
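As a small illustration of the raw estimate, the variance in (1), and the Wilson adjustment described above, the following Python sketch uses the worked example of 11 positives in 300 tests. The function names are ours and are not part of the original analysis.

```python
def raw_prevalence(k: int, n: int) -> float:
    """Raw proportion of positive tests, k(s) / n(s)."""
    return k / n


def estimated_variance(p_hat: float, n: int) -> float:
    """Classical variance estimate of a sample proportion, as in (1)."""
    return p_hat * (1.0 - p_hat) / n


def wilson_prevalence(k: int, n: int) -> float:
    """Adjusted estimate: add two to the positives and four to the tests."""
    return (k + 2) / (n + 4)


# Worked example from the text: 3/100 in 2011 plus 8/200 in 2012.
p_hat = raw_prevalence(11, 300)
p_w = wilson_prevalence(11, 300)

# A county with no positive tests has zero estimated variance under the
# raw estimate, but a strictly positive adjusted estimate.
zero_var = estimated_variance(raw_prevalence(0, 10), 10)
p_w_no_positives = wilson_prevalence(0, 10)
```

For the large-sample example the two estimates are close (11/300 versus 13/304), whereas for the county with 0 positives out of 10 tests the adjusted estimate moves away from zero.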

The raw values of $\hat{p}_w(s)$ vary greatly across counties.
Even counties in close proximity to one another often
have highly different prevalence estimates. This said, some
spatial structure clearly exists in Figure 1. Our next goal
is to extract and explore this structure. To accomplish
this, the weighted head-banging spatial smoothing
algorithm was applied to the county-by-county values of
$\hat{p}_w(s)$. This procedure serves to remove localized
small-scale variations due to random chance, illuminating
large-scale structures that are actual features of the prevalence
rates.

While we will not delve into the details of weighted
head-banging procedures, the technique is a median-based
algorithm proposed by Tukey and Tukey [19] for
smoothing spatial data. One needs to input the longitude
and latitude of the centroid of each county, the county
value to be smoothed, and the corresponding weights.
Mungiole et al. [20] discuss the algorithm in detail.
Head-banging takes its name from a child's game, where the
child presses their face against pins of various lengths
protruding from a board. The result leaves an
impression of the child's face, smoothing the lengths of
adjacent pins but leaving the general structure of the face's
impression. Head-banging techniques are very effective
for down-weighting or removing noisy 'spikes' while
preserving edge structures. A spike is an isolated observation
that lacks confirmation from nearby data. Because of
different testing practices from county to county, many
spikes exist in the heartworm prevalence estimates. An
edge occurs where data change significantly in pattern,
perhaps due to a mountain range. Edges are informative
as they often demarcate distinct data regions.

Figure 8 2007 U.S. county water coverage (Percent). County surface water coverage was obtained from the Census Bureau [12] and was
calculated by dividing the surface water area by total county area reported by the Census Bureau [12]. The surface water coverage data were last
updated in 2011, which were used in our analysis.
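To give a feel for how a median-based smoother removes spikes yet keeps edges, the sketch below applies one unweighted head-banging update at a single location. It is a simplification under stated assumptions and not the weighted procedure used in this paper: the weights are ignored, the triples are assumed to have been chosen already, and the function name is ours.

```python
from statistics import median


def head_bang_once(value, neighbor_pairs):
    """One unweighted head-banging update for a single location.

    value          -- current estimate at the location
    neighbor_pairs -- list of (a, b) values at the two outer points of
                      each triple (triple selection is not shown here)
    """
    if not neighbor_pairs:
        return value  # no triples: leave the value unchanged
    low_screen = median(min(a, b) for a, b in neighbor_pairs)
    high_screen = median(max(a, b) for a, b in neighbor_pairs)
    # The smoothed value is the median of the two screens and the value.
    return median([low_screen, value, high_screen])


# A spike: the center is far above every neighboring pair.
spike = head_bang_once(0.9, [(0.1, 0.2), (0.15, 0.1), (0.2, 0.25)])

# An edge: low values on one side, high on the other; the center is kept.
edge = head_bang_once(0.5, [(0.1, 0.9), (0.2, 0.8)])
```

In the first case the isolated high value is pulled down to the upper screen of its neighbors; in the second the value lies between the two screens and is left as is, which is the edge-preserving behavior described above.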

To run the weighted head-banging algorithm, a parameter
called the number of triples must be selected and the
weights need to be specified. At each county where data
are present, a set of triples (a triple for a county is
represented by the county itself and two nearby counties) was
selected based on the criteria proposed by Hansen [21].
The weight of county s, denoted by w(s), comes from the
inverse of the standard error of $\hat{p}(s)$, with $\hat{p}_w(s)$ replacing
$\hat{p}(s)$:

$$w(s) = \sqrt{\frac{n(s)}{\dfrac{k(s)+2}{n(s)+4}\left(1 - \dfrac{k(s)+2}{n(s)+4}\right)}}.$$
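The weight formula above can be evaluated as in the following sketch, which assumes the weight is the reciprocal of the standard error with the Wilson estimate substituted for the raw proportion, as the text states. The function names are ours.

```python
import math


def wilson_prevalence(k: int, n: int) -> float:
    """Adjusted estimate: add two to the positives and four to the tests."""
    return (k + 2) / (n + 4)


def head_banging_weight(k: int, n: int) -> float:
    """Inverse standard error, using the Wilson estimate in place of k/n."""
    p_w = wilson_prevalence(k, n)
    return math.sqrt(n / (p_w * (1.0 - p_w)))


w_large = head_banging_weight(11, 300)   # many tests
w_small = head_banging_weight(1, 30)     # few tests
w_all_negative = head_banging_weight(0, 10)  # finite despite zero positives
w_no_tests = head_banging_weight(0, 0)   # a county reporting no tests
```

Because the adjusted estimate is never zero or one, the weight is always finite; under this formula a county with no tests receives weight zero, and counties with more tests receive larger weights.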



Figure 2 shows our smoothed prevalence rates based 
on the weighted head-banging procedure. The larger the 
triple parameter is, the smoother (less rough) the resulting 
map will be. We have intentionally left the graphic slightly 
under-smoothed. This is because it is easy to visually 
smooth variabilities away with the eye, but impossible to 
recover true fluctuations that are erroneously smoothed 
away. Thirty triples were used to produce this graphic. 



Figure 2 has interesting implications. First and fore- 
most, heartworm is most prevalent in the Lower 
Mississippi Valley. While the northern latitudes show 
less activity, places where the prevalence rates were rela- 
tively higher do exist. Michigan, Vermont, and Northwest 
Washington, for example, show greater heartworm dis- 
ease prevalence than some of the other states at the 
same latitude. The Northern Rockies perhaps show the 
least heartworm disease prevalence. While many inferences
can be made from Figure 2, we caution the reader
not to over-interpret minutiae. The map is constructed
from only two years of data and there are variations in 
the results that may be spurious. For example, two very 
close locations — say Baton Rouge and New Orleans, 
Louisiana — might be shaded different colors on the 
map, but should not be expected to have radically dif- 
ferent prevalence rates. As additional years of data are 
collected, we expect our baseline to become more accu- 
rate. Another issue involves dispersal of the disease: there 
is no a priori reason to think that prevalence rates are 
static in time. With only two years of observations, time 
trends will be difficult to discern and are not explored 
herein. 
Factor quantification

This section examines the significance of the individ- 
ual factors presented in Table 1 in predicting heartworm 
prevalence. A logistic regression model was created using 
data from 2011 and 2012. Logistic regression methods (as 
opposed to ordinary regression methods) [22] are specif- 
ically designed for cases involving a binary outcome that 
can be summarized by a probability, and hence limited to 
take values in the interval [0, 1]. Our goal is to reproduce 
the structure in Figure 2. 

Let $X(s) = (f_1(s), \ldots, f_8(s); 1_1(s), \ldots, 1_8(s))'$ be the
collection of all predictive factors at county s. The logistic
regression model attempts to explain spatial variations in
p(s) from the factors via

$$p(s) = \frac{e^{g(X(s))}}{1 + e^{g(X(s))}}, \qquad (2)$$

where $g(X(s))$ has form

$$g(X(s)) = \mathrm{logit}(p(s)) = \ln\left(\frac{p(s)}{1 - p(s)}\right) = \beta_0 + \sum_{i=1}^{8} \beta_i f_i(s) + \sum_{i=1}^{8} \gamma_i 1_i(s). \qquad (3)$$

Clarifying terms, ln denotes natural logarithm and
$\mathrm{logit}(x) = \ln(x) - \ln(1 - x)$. Notice that
$e^{g(X(s))}/(1 + e^{g(X(s))}) \in [0, 1]$ for any value of $g(X(s))$. This
guarantees that all predicted prevalence rates lie between zero
and unity. The overall location parameter, $\beta_0$, is common
to all counties while $\beta_1, \ldots, \beta_8$ are regression coefficients
for the eight non-mosquito factors, and $\gamma_1, \ldots, \gamma_8$ are
regression coefficients for the eight mosquito species. The
notation $1_i(s)$, $1 \le i \le 8$, denotes zero-one indicators: $1_i(s)$ is
taken as unity if the ith mosquito type is present in county
s and zero otherwise.

To estimate the parameters $\beta_0$, $\beta_i$, $1 \le i \le 8$, and
$\gamma_i$, $1 \le i \le 8$, from the data, the classical method of
maximum likelihood [23] is used. Once the logistic regression
parameters are estimated, an estimate of p(s) based upon
the fitted model is computed via

$$\hat{p}_{\mathrm{Logistic}}(s) = \frac{e^{\hat{g}(X(s))}}{1 + e^{\hat{g}(X(s))}}, \qquad (4)$$

where quantities in (4) are estimated by

$$\hat{g}(X(s)) = \hat{\beta}_0 + \sum_{i=1}^{8} \hat{\beta}_i f_i(s) + \sum_{i=1}^{8} \hat{\gamma}_i 1_i(s). \qquad (5)$$
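The mapping from factors to a predicted prevalence in (4) and (5) can be sketched as follows. The function names and the toy coefficients are hypothetical and serve only to show the mechanics; they are not the fitted values of this study.

```python
import math


def linear_predictor(beta0, betas, factors, gammas, indicators):
    """Intercept plus factor terms plus mosquito presence terms, as in (5)."""
    g = beta0
    g += sum(b * f for b, f in zip(betas, factors))
    g += sum(c * ind for c, ind in zip(gammas, indicators))
    return g


def inverse_logit(g: float) -> float:
    """Map the linear predictor to a probability, e^g / (1 + e^g)."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)  # avoids overflow for very negative g
    return e / (1.0 + e)


# Toy example: one continuous factor and one presence indicator.
g_toy = linear_predictor(-1.0, [0.5], [2.0], [0.25], [1])
p_toy = inverse_logit(g_toy)
```

Whatever value the linear predictor takes, the returned probability stays between zero and one, and applying the logit to it recovers the linear predictor.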
Figure 10 2011 U.S. median household income (Dollars). Median household incomes were obtained from the Census Bureau (2010 census).
These data were adjusted for inflation based on 2010 dollars; the Census Bureau adjusts by multiplying 2011 median household income by the ratio
of the Consumer Price Index of 2010 and 2011 [14].

Figure 11 Presence of Aedes aegypti.

Figure 12 Presence of Aedes albopictus.



Results and discussion 

Figure 19 shows the results from a fitted logistic regression
model after smoothing the $\hat{p}_{\mathrm{Logistic}}(s)$ estimates with
the weighted head-banging algorithm with 30 triples. The
results reproduce the rough structure of Figure 2. The
overall implication is that canine heartworm seems to be
reasonably quantifiable.

Clarifying details, in the head-banging procedure, the
inverse of the standard deviation of $\hat{p}_{\mathrm{Logistic}}(s)$ is used as
the weight. For any factor, if both 2011 and 2012 data are
available (such as temperature and precipitation), then the
average of the observations from the two years is used in
equation (5). Otherwise, the most recent measurement is
used.
Figure 14 Presence of Aedes sierrensis.

The individual factors and their predictive significance
are worth discussing. Table 2 lists estimates of each logistic
model parameter, standard errors, odds ratios, and
95% confidence intervals for the odds ratios. Elaborating,
the standard error is the square root of the estimated
variance of the regression coefficient estimator. Smaller standard
errors imply greater precision for the parameter estimate.
All p-values for the hypothesis test that the parameter is
zero are less than 0.0001. If the parameter for a factor
is in truth zero, the factor is not a good predictor of prevalence
rates. A positive/negative parameter estimate means that
the corresponding factor is judged to increase/decrease
heartworm prevalence as the factor increases. Since the
mosquito factors are only presence/absence indicators, a
positive regression coefficient estimate means that prevalence
is higher when the mosquito species is present, and
lower when absent. Exp(β) is the odds ratio: the factor by
which the odds of a dog testing heartworm positive are
multiplied when the corresponding predictor increases by
one unit. For mosquito factors, Exp(β)
is interpreted directly as the odds of a dog testing heartworm
positive when the corresponding mosquito species
is present relative to the odds when this mosquito species
is absent. Hence, if
Exp(β) is larger than unity, then heartworm prevalence
is increased by the corresponding factor, and vice versa.
The units for the elevation regression coefficient estimator
are in thousands of feet; the estimated coefficient
for median household income is in thousands of
dollars.

Figure 15 Presence of Aedes trivittatus.

Figure 16 Presence of Anopheles punctipennis.

Figure 17 Presence of Anopheles quadrimaculatus.

Figure 18 Presence of Culex quinquefasciatus.

The model results in a fairly good fit: McFadden's
pseudo $R^2$ = 0.62 [24]. All predictors were significant
predictors of heartworm prevalence rates at level
0.0001. Higher temperatures are associated with higher
prevalence: heartworm is generally more prevalent in
warmer temperatures. Higher median household incomes
are associated with lower prevalence rates. Logically,
higher income pet owners can more easily afford heartworm
preventives. Not surprisingly, heartworm prevalence
decreases with increasing population density and
elevation. It is surprising to see that prevalence declines
with higher humidities. One explanation for this may
lie with the high correlations among elevation, relative
humidity, and temperature. This will be explored further
below when a model that allows the factors to interact is
examined.

Table 1 Heartworm factors considered for inclusion in the study

|  | Factors | Data available period | Scale | Source |
|---|---|---|---|---|
| Climate factors | Annual temperature | 2011 and 2012 | Division | National Climatic Data Center (NCDC) |
| Climate factors | Annual precipitation | 2011 and 2012 | Division | NCDC |
| Climate factors | Annual relative humidity | 2011 and 2012 | Station | NCDC |
| Geographic factors | Elevation | 2012 | County | http://www.cohp.org/ |
| Geographic factors | Percentage forest coverage | 2007 | County | United States Department of Agriculture (USDA) |
| Geographic factors | Percentage surface water coverage | 2010 | County | U.S. Census Bureau |
| Societal factors | Population density | 2010 | County | U.S. Census Bureau |
| Societal factors | Median household income | 2011 | County | U.S. Census Bureau |
| Mosquito species | Aedes aegypti | 2008 | County | Moore, CG. [15] |
| Mosquito species | Aedes albopictus | 2012 | County | Hynes NA [16] |
| Mosquito species | Aedes canadensis | 2004 | County | RF Darsie Jr. and RA Ward [17] |
| Mosquito species | Aedes sierrensis | 2004 | County | RF Darsie Jr. and RA Ward |
| Mosquito species | Aedes trivittatus | 2004 | County | RF Darsie Jr. and RA Ward |
| Mosquito species | Anopheles punctipennis | 2004 | County | RF Darsie Jr. and RA Ward |
| Mosquito species | Anopheles quadrimaculatus | 2004 | County | RF Darsie Jr. and RA Ward |
| Mosquito species | Culex quinquefasciatus | 2004 | County | RF Darsie Jr. and RA Ward |

Figure 19 Predicted heartworm prevalence from all factors. Number of triples: 30.

Presence of Culex quinquefasciatus, Aedes sierrensis,
Anopheles punctipennis, Aedes trivittatus, and
Aedes canadensis is associated with higher heartworm
prevalence (positive estimates in Table 2). The other three
mosquito species are estimated to decrease prevalence rates.
The reader should not
a priori believe that presence of any mosquito species acts
to increase prevalences in the model: the 16 factors are
being "judged" in tandem.



Table 2 Significance of factors

| Effect | Estimate | Standard error | Exp(β) | 95% Wald CI |
|---|---|---|---|---|
| Intercept | -8.0610 | 0.0819 |  |  |
| Temperature | 0.0720 | 0.0011 | 1.075 | (1.072, 1.077) |
| Median household income | -0.0227 | 0.0003 | 0.978 | (0.977, 0.978) |
| Population density | -0.0175 | 0.0004 | 0.983 | (0.982, 0.984) |
| Precipitation | 0.0982 | 0.0049 | 1.103 | (1.093, 1.114) |
| Elevation | -0.0605 | 0.0030 | 0.941 | (0.936, 0.947) |
| Relative humidity | -0.0147 | 0.0008 | 0.985 | (0.984, 0.987) |
| Forest coverage | 1.2493 | 0.1156 | 3.448 | (2.781, 4.375) |
| Surface water coverage | 0.1054 | 0.0265 | 1.111 | (1.055, 1.170) |
| Aedes trivittatus | 0.9987 | 0.0153 | 2.715 | (2.634, 2.797) |
| Culex quinquefasciatus | 0.4063 | 0.0152 | 1.501 | (1.457, 1.547) |
| Aedes sierrensis | 0.6947 | 0.0293 | 2.003 | (1.892, 2.121) |
| Anopheles punctipennis | 0.4132 | 0.0182 | 1.512 | (1.459, 1.567) |
| Anopheles quadrimaculatus | -0.1663 | 0.0185 | 0.847 | (0.817, 0.878) |
| Aedes aegypti | -0.1232 | 0.0112 | 0.884 | (0.865, 0.904) |
| Aedes canadensis | 0.1481 | 0.0143 | 1.160 | (1.128, 1.192) |
| Aedes albopictus | -0.0894 | 0.0112 | 0.915 | (0.895, 0.935) |

**All factors are significant at level 0.01.
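The Exp(β) and Wald interval columns of Table 2 follow from the estimate and its standard error. The sketch below recomputes them for the temperature row; the function name is ours, and the multiplier 1.96 is the usual normal approximation for a 95% interval, which we assume (the software used for the table may carry more digits).

```python
import math


def odds_ratio_with_wald_ci(estimate: float, std_error: float, z: float = 1.96):
    """Return (odds ratio, lower limit, upper limit) for one coefficient."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )


# Temperature row of Table 2: estimate 0.0720, standard error 0.0011.
or_temp, lo_temp, hi_temp = odds_ratio_with_wald_ci(0.0720, 0.0011)
print(round(or_temp, 3), round(lo_temp, 3), round(hi_temp, 3))
```

Rounded to three decimals, this reproduces the tabled odds ratio of 1.075 and interval (1.072, 1.077) for temperature: each additional degree is associated with roughly a 7.5% increase in the odds of a positive test, other factors held fixed.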
It is perhaps remarkable that every factor considered
is judged to influence prevalence rates with a significance
level of 0.0001. It leaves an unsettling feeling that
other important factors may have been omitted. We
reiterate that this study is exploratory and will hopefully
be improved in the future. We should also mention the
usual regression caveat: equation (3) presupposes a linear
relationship between the logit and the factors that may not
be realistic at all factor
levels. This drawback is not germane to this study; linear
regression models, of course, serve as rudimentary
guidance.

The above results, often called a main effects analysis,
can be improved by adding interaction terms to the
regression. An interaction between factors i and j adds a
regression term of the form f_i(s) f_j(s) to equation (3).
Table 3 reports a logistic regression model fit with most
non-mosquito factors allowed to interact pairwise. The
notation Temperature*Elevation, for example, refers to the
interaction between temperature and elevation. There are
C(8, 2) = 28 possible interacting pairs. However, only 16
interaction pairs were considered due to practical
constraints. For example, an interaction between elevation
and temperature is plausible since heartworm prevalence at
higher elevations may differ according to temperature. By
contrast, median household income and temperature could not
sensibly be allowed to interact, since heartworm prevalence
for dog owners with high salaries does not depend on the
temperature where they live, and vice versa. The mosquito
factors were not allowed to interact with other factors
because they are simply presence/absence variables. All
insignificant factors and interactions were eliminated at
the 5% level with a standard backward elimination
regression procedure [25]. To clarify, we first fitted a
model with all individual factors and the 16 interactions.
The term with the largest p-value in the regression was
eliminated if its p-value exceeded 0.05, and the model was
refitted. This procedure was repeated until all
insignificant factors were eliminated at the
5% level.
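
The elimination loop described above can be sketched as follows. This is a minimal illustration, not the code used in the study: `fit_pvalues` is a hypothetical callable standing in for a logistic regression refit, and the p-values in the toy example are made up.

```python
def backward_eliminate(terms, fit_pvalues, alpha=0.05):
    """Drop the least significant term until all p-values are <= alpha.

    terms       : list of term names (main effects and interactions)
    fit_pvalues : hypothetical callable; refits the model on the given
                  terms and returns a dict {term: p-value}
    """
    kept = list(terms)
    while kept:
        pvalues = fit_pvalues(kept)                    # refit on current terms
        worst = max(kept, key=lambda t: pvalues[t])    # largest p-value
        if pvalues[worst] <= alpha:
            break                                      # everything is significant
        kept.remove(worst)                             # drop one term, then refit
    return kept


def toy_fit(current_terms):
    # Stand-in for a real refit; these p-values are invented for illustration.
    made_up = {
        "Temperature": 0.0001,
        "Elevation": 0.0001,
        "Temperature*Elevation": 0.003,
        "Relative humidity": 0.40,
        "Precipitation*Forest coverage": 0.12,
    }
    return {t: made_up[t] for t in current_terms}


all_terms = ["Temperature", "Elevation", "Temperature*Elevation",
             "Relative humidity", "Precipitation*Forest coverage"]
print(backward_eliminate(all_terms, toy_fit))
```

In a real fit the p-values change after every refit, which is why only one term is removed per pass.
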

Including interactions increased McFadden's pseudo R²
to 0.65. Table 3 summarizes the results of this procedure.
Exp(β) and its confidence intervals are not included in
this table because, when factor interactions are included,
the value of Exp(β) depends not only on the coefficient of
the factor, but also on all the other factors that interact
with it. All listed factors are significant at a
significance level of 0.0001, except the surface water
coverage*relative humidity interaction (p-value of 0.0002)
and surface water coverage (p-value of 0.0327). Differences
from the Table 2 results exist, but are not radical.
Relative humidity, for example, is not itself significant
but interacts with several other factors. All other
individual factors remain significant predictors. The
parameter estimates of the main effects are slightly
different from



Table 3 Significant factors and interactions*

Effect                                        Estimate      Standard error
Intercept                                     -8.4312       0.0993
Median household income                       -0.0267       0.0004
Temperature                                    0.0655       0.0015
Elevation                                      0.5609       0.0171
Population density                            -0.0444       0.0015
Temperature*Elevation                         -0.00?7       0.000?
Temperature*Forest coverage                    0.4?40       0.01??
Temperature*Surface water coverage            -0.0?4?       0.00?4
Elevation*Relative humidity                   -0.0041       0.0002
Forest coverage                               [illegible]   2.11?7
Population density*Median household income     0.0005       <0.0001
Forest coverage*Surface water coverage        -18.1276      1.5896
Precipitation                                  0.3049       0.0278
Precipitation*Surface water coverage          -0.3247       0.0299
Precipitation*Elevation                        0.0196       0.0020
Precipitation*Relative humidity               -0.0032       0.0004
Elevation*Surface water coverage               0.2472       0.0342
Elevation*Woodland coverage                    0.7325       0.1019
Forest coverage*Relative humidity              0.0903       0.0234
Surface water coverage*Relative humidity       0.0281       0.0075
Surface water coverage                         1.1773       0.5507
Aedes trivittatus                              1.0682       0.0158
Aedes sierrensis                               1.1311       0.0313
Culex quinquefasciatus                         0.5483       0.0163
Anopheles punctipennis                         0.3948       0.0191
Aedes canadensis                               0.1567       0.0151
Aedes albopictus                              -0.0976       0.0116
Aedes aegypti                                 -0.0879       0.0126
Anopheles quadrimaculatus                     -0.1084       0.0163

* All factors and their interactions are significant at level 0.01, except that
surface water coverage is significant at level 0.05.



those in Table 2 since interactions are now included. For
example, the parameter estimates of both median household
income and population density are smaller than those in
Table 2 and their interaction is significant. The
significant interaction indicates that the change in
heartworm prevalence as median household income changes is
conditional on the value of population density, and
vice versa.
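
This conditional effect, which is also why Exp(β) is not reported in Table 3, can be illustrated with the two relevant Table 3 estimates. The sketch assumes the interaction enters the linear predictor as a simple product of the two factors. The population density values are hypothetical, since this section does not give the units of either factor, and the coefficients are used as rounded in the table.

```python
import math

B_INCOME = -0.0267             # Table 3: Median household income
B_DENSITY_X_INCOME = 0.0005    # Table 3: Population density*Median household income

def income_odds_ratio(population_density):
    """Odds ratio for a one-unit income increase at a given population density.

    With a product interaction, the income slope on the log-odds scale is
    B_INCOME + B_DENSITY_X_INCOME * population_density, not B_INCOME alone.
    """
    slope = B_INCOME + B_DENSITY_X_INCOME * population_density
    return math.exp(slope)

# Hypothetical density values, expressed in whatever units the model uses
for density in (0.0, 10.0, 50.0):
    print(density, round(income_odds_ratio(density), 4))
```

No single odds ratio describes the income effect: it moves toward 1 as population density increases.
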

Figure 20 presents a graphic analogous to Figure 19, with
interactions included in the logistic regression model and
predictions made from the fitted model in Table 3. Results
allow for interactions, and backward selection was used to
eliminate any insignificant factors and their interactions.
The results are comparable to those in Figure 19.

Conclusions 

Our factor list in Table 1 is likely incomplete; however,
it should serve as a starting point. Several important
factors that were discussed in Brown et al. [2] could not
be included in the analysis due to data deficiencies. For
example, coyotes and feral dogs are known reservoirs of
heartworm infection. Unfortunately, no complete data on
their populations were available. Additionally, mosquito
vector abundance data were unavailable. Should information
regarding coyote counts and vector abundance be
consistently collected in the future and incorporated into
the methods discussed herein, the model would improve
in power and practical implications.

Some of the factors could be refined. For example, it is
frequently posited that heartworm degree units for
temperature likely influence prevalence rates [26]. Such
units are typically measured from 14 degrees Celsius
(57 degrees Fahrenheit), below which it is difficult to
have high heartworm transmission. We have not looked at
temperature departures above 14 degrees Celsius because our
data are annual. Also, subtracting 14 from the annual
temperatures would reparametrize the value of β_0, but
would not change the overall regression fit.
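
The reparametrization remark can be checked numerically. The sketch below assumes a main-effects linear predictor with a single temperature term. The coefficients b0 and b1 are hypothetical, and the base value 14 follows the text.

```python
def logit_original(temp, b0, b1):
    """Linear predictor using raw annual temperature."""
    return b0 + b1 * temp

def logit_shifted(temp, b0, b1, base=14.0):
    """Same predictor written in terms of (temp - base).

    The slope is unchanged; only the intercept absorbs the shift:
    b0 + b1 * temp == (b0 + b1 * base) + b1 * (temp - base)
    """
    b0_shifted = b0 + b1 * base
    return b0_shifted + b1 * (temp - base)

# Hypothetical coefficients, for illustration only
b0, b1 = -5.0, 0.07
for temp in (0.0, 14.0, 25.5):
    print(temp, logit_original(temp, b0, b1), logit_shifted(temp, b0, b1))
```

Both forms give the same linear predictor for every temperature, so the fitted probabilities are identical. A true degree-unit factor would instead accumulate only the departures above the threshold, which requires data finer than annual means.
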

The fitted regression model is currently being explored
to forecast future values of heartworm prevalence. Indeed,
many of the predictive factors vary with time (temperature
and precipitation, for example). From a forecast of these
factors, say a year in advance, and our fitted logistic
regression model, predictions of prevalence rates can
be made. This can inform the pet owner and practitioner
in advance of a potentially bad heartworm season. The
results can also be used to assess prevalence rate changes
due to climate change. For example, if annual temperatures
are expected to increase by one degree Fahrenheit, one
could add one degree to the temperature in the fitted
logistic regression model to predict the change.
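
The temperature scenario can be sketched as below. The sketch assumes a main-effects logistic model in which all other factors are held fixed, so a temperature change shifts the log-odds by the temperature coefficient times the change. With the interaction model of Table 3 the shift would also depend on the interacting factors. The baseline prevalence and the coefficient are hypothetical values, not estimates from this study.

```python
import math

def shifted_prevalence(prevalence, temp_coef, delta_temp=1.0):
    """Prevalence after changing temperature by delta_temp, other factors fixed.

    Assumes a main-effects logistic model: the log-odds move by
    temp_coef * delta_temp.
    """
    log_odds = math.log(prevalence / (1.0 - prevalence))
    new_log_odds = log_odds + temp_coef * delta_temp
    return 1.0 / (1.0 + math.exp(-new_log_odds))

# Hypothetical inputs: 2% baseline prevalence, coefficient 0.05 per degree F
baseline = 0.02
print(shifted_prevalence(baseline, temp_coef=0.05, delta_temp=1.0))
```

Because the model is linear on the log-odds scale, the same one-degree increase produces a larger absolute change in a county with moderate prevalence than in one where prevalence is close to zero.
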